uncertain point
Reviews: Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss
This paper provides theoretical analysis and empirical examples for two phenomena in active learning. The first is that the 0-1 loss of a model trained on the subset of the data selected by uncertainty sampling can be smaller than that of a model trained on the whole dataset. The second is that uncertainty sampling can "converge" to different models and predictive results. The analysis shows that the reason for both is that the expected gradient of the "surrogate" loss at the most uncertain point is in the direction of the gradient of the current 0-1 loss. This result relies on a setup in which the most uncertain point is selected from a minipool, a subset sampled randomly without replacement from the entire set.
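The minipool setup described in this review can be illustrated with a minimal sketch. It assumes a linear classifier with labels in {-1, +1}, the logistic loss as the surrogate, and a plain (unpreconditioned) gradient step; the function names, the learning rate, and the choice of surrogate are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def most_uncertain_index(w, X_pool):
    """Index of the pool point closest to the decision boundary w.x = 0."""
    return int(np.argmin(np.abs(X_pool @ w)))

def uncertainty_sampling_step(w, X, y, minipool_size, lr, rng):
    """Draw a minipool without replacement, query its most uncertain
    point, and take one gradient step on the logistic surrogate loss."""
    # Minipool: a random subset of the full dataset, no replacement.
    pool = rng.choice(len(X), size=minipool_size, replace=False)
    i = int(pool[most_uncertain_index(w, X[pool])])
    # Gradient of log(1 + exp(-y * w.x)) with respect to w.
    margin = y[i] * (X[i] @ w)
    grad = -y[i] * X[i] / (1.0 + np.exp(margin))
    return w - lr * grad, i
```

Calling `uncertainty_sampling_step` repeatedly with `rng = np.random.default_rng(seed)` gives one possible trajectory; different seeds or starting weights may end at different models, which is the second phenomenon the review mentions.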
Fine-grained Uncertainty Modeling in Neural Networks
Soni, Rahul, Shah, Naresh, Moore, Jimmy D.
Existing uncertainty modeling approaches try to detect an out-of-distribution point from the in-distribution dataset. We extend this argument to detect finer-grained uncertainty that distinguishes between (a) certain points, (b) uncertain points that lie within the data distribution, and (c) out-of-distribution points. Our method corrects overconfident NN decisions, detects outlier points, and learns to say "I don't know" when uncertain about a critical point between the top two predictions. In addition, we provide a mechanism to quantify class distribution overlap in the decision manifold and investigate its implications for model interpretability. Our method has two steps: in the first step, it builds a class distribution using Kernel Activation Vectors (kav) extracted from the network. In the second step, the algorithm determines the confidence of a test point by a hierarchical decision rule based on the chi-squared distribution of squared Mahalanobis distances. Our method sits on top of a given neural network, requires a single scan of the training data to estimate class distribution statistics, and scales well to deep networks and wider pre-softmax layers. As a positive side effect, our method helps to prevent adversarial attacks without requiring any additional training. This is achieved directly by substituting our robust uncertainty layer for the softmax layer at the evaluation phase.
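A rough sketch of a decision rule of this kind follows. It is one plausible reading of the abstract, not the authors' exact procedure: generic feature vectors stand in for the kernel activation vectors, the three-way rule and the 0.99 confidence level are assumptions, and the chi-squared quantile uses the Wilson-Hilferty approximation to stay within NumPy and the standard library. It expects at least two classes.

```python
import numpy as np
from statistics import NormalDist

def chi2_quantile(p, dof):
    """Wilson-Hilferty approximation to the chi-squared quantile."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * np.sqrt(c)) ** 3

def fit_class_stats(features, labels):
    """Single scan of training features: per-class mean and inverse covariance."""
    stats = {}
    for c in np.unique(labels):
        F = features[labels == c]
        # Small ridge term keeps the covariance invertible (assumed value).
        cov = np.cov(F, rowvar=False) + 1e-6 * np.eye(F.shape[1])
        stats[int(c)] = (F.mean(axis=0), np.linalg.inv(cov))
    return stats

def classify_with_uncertainty(x, stats, p=0.99):
    """Hypothetical three-way rule on squared Mahalanobis distances."""
    threshold = chi2_quantile(p, x.shape[0])
    d2 = {c: float((x - mu) @ P @ (x - mu)) for c, (mu, P) in stats.items()}
    best, second = sorted(d2, key=d2.get)[:2]
    if d2[best] > threshold:
        return "out-of-distribution", None   # far from every class
    if d2[second] <= threshold:
        return "uncertain", (best, second)   # plausible under two classes
    return "certain", best
```

Under this reading, a point inside exactly one class's confidence region is "certain", a point inside the regions of its two nearest classes is "uncertain", and a point outside all regions is flagged as out-of-distribution.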
Probabilistic Sparse Subspace Clustering Using Delayed Association
Jaberi, Maryam, Pensky, Marianna, Foroosh, Hassan
Discovering and clustering subspaces in high-dimensional data is a fundamental problem of machine learning with a wide range of applications in data mining, computer vision, and pattern recognition. Earlier methods divided the problem into two separate stages of finding the similarity matrix and finding clusters. Similar to some recent works, we integrate these two steps using a joint optimization approach. We make the following contributions: (i) We estimate the reliability of the cluster assignment for each point before assigning a point to a subspace. We group the data points into two groups, "certain" and "uncertain", with the assignment of the latter group delayed until their subspace association certainty improves. (ii) We demonstrate that delayed association is better suited for clustering subspaces that have ambiguities, i.e. when subspaces intersect or data are contaminated with outliers/noise. (iii) We demonstrate experimentally that such delayed probabilistic association leads to a more accurate self-representation and final clusters. The proposed method has higher accuracy both for points that lie exclusively in one subspace and for those that are on the intersection of subspaces. (iv) We show that delayed association leads to a large reduction in computational cost, since it allows for incremental spectral clustering.
Clustering Dynamic Spatio-Temporal Patterns in The Presence of Noise and Missing Data
Chen, Xi (University of Minnesota) | Faghmous, James H. (University of Minnesota and Mt. Sinai School of Medicine) | Khandelwal, Ankush (University of Minnesota) | Kumar, Vipin (University of Minnesota)
Clustering has gained widespread use, especially for static data. However, the rapid growth of spatio-temporal data from numerous instruments, such as earth-orbiting satellites, has created a need for spatio-temporal clustering methods to extract and monitor dynamic clusters. Dynamic spatio-temporal clustering faces two major challenges: First, the clusters are dynamic and may change in size, shape, and statistical properties over time. Second, numerous spatio-temporal data are incomplete, noisy, heterogeneous, and highly variable (over space and time). We propose a new spatio-temporal data mining paradigm to autonomously identify dynamic spatio-temporal clusters in the presence of noise and missing data. Our proposed approach is more robust than traditional clustering and image segmentation techniques in the case of dynamic patterns, non-stationarity, heterogeneity, and missing data. We demonstrate our method's performance on a real-world application of monitoring inland water bodies on a global scale.